Results 1 - 8 of 8
1.
PLoS One ; 19(3): e0300926, 2024.
Article in English | MEDLINE | ID: mdl-38551907

ABSTRACT

To examine visual speech perception (i.e., lip-reading), we created a multi-layer network (the AV-net) that contained: (1) an auditory layer with nodes representing phonological word-forms and edges connecting words that were phonologically related, and (2) a visual layer with nodes representing the viseme representations of words and edges connecting viseme representations that differed by a single viseme, with additional edges connecting related nodes across the two layers. The results of several computer simulations (in which activation diffused across the network to simulate word identification) are reported and compared to the performance of human participants who identified the same words in a condition in which audio and visual information were both presented (Simulation 1), in an audio-only presentation condition (Simulation 2), and in a visual-only presentation condition (Simulation 3). A further simulation (Simulation 4) examined the influence of phonological information on visual speech perception by comparing performance in the multi-layer AV-net to performance in a single-layer network that contained only a visual layer (viseme nodes with edges connecting viseme representations that differed by a single viseme). We also report several analyses of the errors made by human participants in the visual-only presentation condition. These results have implications for future research on, and training of, lip-reading, and for the development of automatic lip-reading devices and software for individuals with certain developmental or acquired disorders, or for listeners with normal hearing in noisy conditions.
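
The spreading-activation mechanism summarized above can be illustrated with a short sketch. The Python code below builds a toy two-layer lexicon (auditory word-forms plus viseme representations, joined by cross-layer edges) and lets activation diffuse from a stimulus word for a few steps. The words, edge structure, diffusion rule, and parameter values are illustrative assumptions, not the lexicon or parameters used in the published AV-net simulations.

    # Toy two-layer lexical network with spreading activation (illustrative only).
    from collections import defaultdict

    # Hypothetical lexicon: "A:" nodes are phonological word-forms,
    # "V:" nodes are viseme representations of the same words.
    edges = [
        ("A:bat", "A:pat"),  # phonologically related words
        ("A:bat", "A:mat"),
        ("V:bat", "V:pat"),  # viseme strings differing by a single viseme
        ("A:bat", "V:bat"),  # cross-layer edges linking a word's two representations
        ("A:pat", "V:pat"),
        ("A:mat", "V:bat"),  # /b/, /m/, /p/ share a viseme, so "mat" maps to "V:bat"
    ]

    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)

    def diffuse(activation, steps=3, retain=0.5):
        """Each step, a node keeps `retain` of its activation and spreads the
        rest equally to its neighbours (a simple uniform diffusion rule)."""
        for _ in range(steps):
            nxt = defaultdict(float)
            for node, act in activation.items():
                nxt[node] += act * retain
                neighbours = graph[node]
                share = act * (1.0 - retain) / len(neighbours) if neighbours else 0.0
                for nb in neighbours:
                    nxt[nb] += share
            activation = dict(nxt)
        return activation

    # Audiovisual presentation of "bat": both the auditory and visual nodes start active.
    final = diffuse({"A:bat": 1.0, "V:bat": 1.0})
    for node, act in sorted(final.items(), key=lambda kv: -kv[1]):
        print(f"{node:8s} {act:.3f}")

Ranking the final activation values gives a crude analogue of word identification: the intended word should end up most active, with its phonological and visemic neighbours competing for the remainder.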


Subject(s)
Speech Perception , Humans , Speech Perception/physiology , Visual Perception/physiology , Lipreading , Speech , Linguistics
2.
J Acoust Soc Am ; 132(3): EL208-14, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22979834

ABSTRACT

Previous research has shown that the musical intervals found in speech are associated with various emotions. Intervals can be classified by their level of consonance or dissonance, that is, by how pleasant or unpleasant the combined tones sound to the ear. Exploratory investigations have indicated that in an agreeable conversation the pitches of the last word of an utterance and the first word of the conversation partner's utterance are consonantly related, whereas in a disagreeable conversation the two pitches are dissonantly related. The present results showed that the intervals between the tonics of the utterances in a conversation corresponded to the agreement between interlocutors.
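
For readers unfamiliar with interval classification, the short Python sketch below computes the musical interval between two fundamental frequencies in semitones and labels it as traditionally consonant or dissonant. The frequencies and the consonance table are illustrative assumptions drawn from standard Western music theory, not measurements or criteria taken from this study.

    import math

    # Interval sizes (in semitones, modulo the octave) traditionally treated as
    # consonant in Western music theory: unison, minor/major 3rd, perfect 4th,
    # perfect 5th, minor/major 6th. Everything else is treated as dissonant here.
    CONSONANT_SEMITONES = {0, 3, 4, 5, 7, 8, 9}

    def interval_semitones(f1_hz, f2_hz):
        """Musical interval between two pitches, in semitones."""
        return 12.0 * math.log2(f2_hz / f1_hz)

    def is_consonant(f1_hz, f2_hz):
        semis = round(abs(interval_semitones(f1_hz, f2_hz))) % 12
        return semis in CONSONANT_SEMITONES

    # Hypothetical example: the last-word pitch of one speaker (220 Hz) and the
    # first-word pitch of the reply (330 Hz) form roughly a perfect fifth.
    print(interval_semitones(220.0, 330.0))  # ~7.02 semitones
    print(is_consonant(220.0, 330.0))        # True (consonant)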


Subject(s)
Music , Pitch Perception , Speech Acoustics , Speech Perception , Voice Quality , Communication , Cues , Emotions , Humans , Speech Production Measurement
3.
J Acoust Soc Am ; 116(1): 507-18, 2004 Jul.
Article in English | MEDLINE | ID: mdl-15296010

ABSTRACT

Information about the acoustic properties of a talker's voice is available in optical displays of speech, and vice versa, as evidenced by perceivers' ability to match faces and voices based on vocal identity. The present investigation used point-light displays (PLDs) of visual speech and sinewave replicas of auditory speech in a cross-modal matching task to assess perceivers' ability to match faces and voices under conditions in which only isolated kinematic information about vocal tract articulation was available. These stimuli were also used in a word recognition experiment under auditory-alone and audiovisual conditions. The results showed that isolated kinematic displays provide enough information to match the source of an utterance across sensory modalities. Furthermore, isolated kinematic displays can be integrated to yield better word recognition performance under audiovisual conditions than under auditory-alone conditions. The results are discussed in terms of their implications for describing the nature of speech information and current theories of speech perception and spoken word recognition.


Subject(s)
Recognition (Psychology) , Speech Perception/physiology , Visual Perception/physiology , Vocal Cords/physiology , Acoustic Stimulation , Adolescent , Adult , Biomechanical Phenomena , Facial Expression , Female , Humans , Male , Sound Spectrography , Vocabulary
4.
J Exp Psychol Hum Percept Perform ; 30(2): 378-96, 2004 Apr.
Article in English | MEDLINE | ID: mdl-15053696

ABSTRACT

In a cross-modal matching task, participants were asked to match visual and auditory displays of speech based on the identity of the speaker. The present investigation used this task with acoustically transformed speech to examine the properties of sound that can convey cross-modal information. Word recognition performance was also measured under the same transformations. The authors found that cross-modal matching was only possible under transformations that preserved the relative spectral and temporal patterns of formant frequencies. In addition, cross-modal matching was only possible under the same conditions that yielded robust word recognition performance. The results are consistent with the hypothesis that acoustic and optical displays of speech simultaneously carry articulatory information about both the underlying linguistic message and indexical properties of the talker.


Subject(s)
Recognition (Psychology) , Speech Perception , Vocabulary , Adult , Female , Humans , Male , Random Allocation , Reaction Time , Sound Spectrography , Time Factors
5.
Ecol Psychol ; 16(3): 159-187, 2004.
Article in English | MEDLINE | ID: mdl-21544262

ABSTRACT

Four experiments examined the nature of multisensory speech information. In Experiment 1, participants were asked to match heard voices with dynamic visual-alone video clips of speakers' articulating faces. This cross-modal matching task was used to examine whether vocal source matching can be accomplished across sensory modalities. The results showed that observers could match speaking faces and voices, indicating that information about the speaker was available for cross-modal comparisons. In a series of follow-up experiments, several stimulus manipulations were used to determine some of the critical acoustic and optic patterns necessary for specifying cross-modal source information. The results showed that cross-modal source information was not available in static visual displays of faces and was not contingent on a prominent acoustic cue to vocal identity (fundamental frequency, f0). Furthermore, cross-modal matching was not possible when the acoustic signal was temporally reversed.

6.
J Speech Lang Hear Res ; 46(2): 390-404, 2003 Apr.
Article in English | MEDLINE | ID: mdl-14700380

ABSTRACT

The present study examined how postlingually deafened adults with cochlear implants combine visual information from lipreading with auditory cues in an open-set word recognition task. Adults with normal hearing served as a comparison group. Word recognition performance was assessed using lexically controlled word lists presented under auditory-only, visual-only, and combined audiovisual presentation formats. Effects of talker variability were studied by manipulating the number of talkers producing the stimulus tokens. Lexical competition was investigated using sets of lexically easy and lexically hard test words. To assess the degree of audiovisual integration, a measure of visual enhancement, R(a), was used to assess the gain in performance provided in the audiovisual presentation format relative to the maximum possible performance obtainable in the auditory-only format. Results showed that word recognition performance was highest for audiovisual presentation followed by auditory-only and then visual-only stimulus presentation. Performance was better for single-talker lists than for multiple-talker lists, particularly under the audiovisual presentation format. Word recognition performance was better for the lexically easy than for the lexically hard words regardless of presentation format. Visual enhancement scores were higher for single-talker conditions compared to multiple-talker conditions and tended to be somewhat better for lexically easy words than for lexically hard words. The pattern of results suggests that information from the auditory and visual modalities is used to access common, multimodal lexical representations in memory. The findings are discussed in terms of the complementary nature of auditory and visual sources of information that specify the same underlying gestures and articulatory events in speech.
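
The visual-enhancement measure R(a) referred to in this abstract (and in the next one) is conventionally computed as relative gain: the improvement from auditory-only to audiovisual scores, scaled by the room left for improvement. The Python sketch below implements that standard formulation, Ra = (AV - A) / (1 - A), on proportion-correct scores; the example scores are hypothetical, and the formula is the commonly used relative-gain definition rather than a value quoted from the article.

    def visual_enhancement(av_correct, a_correct):
        """Relative audiovisual gain Ra = (AV - A) / (1 - A), where AV and A are
        proportion-correct scores in the audiovisual and auditory-only formats.
        Ra = 1 means everything the auditory signal missed was recovered from
        vision; Ra = 0 means seeing the talker added nothing."""
        if a_correct >= 1.0:
            return 0.0  # ceiling performance in A leaves no room for gain
        return (av_correct - a_correct) / (1.0 - a_correct)

    # Hypothetical scores: 55% correct auditory-only, 85% correct audiovisual.
    print(visual_enhancement(av_correct=0.85, a_correct=0.55))  # ~0.67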


Subject(s)
Cochlear Implants , Deafness/therapy , Persons With Hearing Impairments , Speech Perception , Adult , Aged , Analysis of Variance , Case-Control Studies , Cues , Deafness/physiopathology , Female , Humans , Male , Middle Aged
7.
Proc Int Conf Spok Lang Process ; 2002: 1689-1692, 2002.
Article in English | MEDLINE | ID: mdl-25364781

ABSTRACT

The present study examined how prelingually deafened children and postlingually deafened adults with cochlear implants (CIs) combine visual speech information with auditory cues. Performance was assessed under auditory-alone (A), visual-alone (V), and combined audiovisual (AV) presentation formats. A measure of visual enhancement, Ra, was used to assess the gain in performance provided in the AV condition relative to the maximum possible performance in the auditory-alone format. Word recognition was highest for AV presentation, followed by A and V, respectively. Children who received more visual enhancement also produced more intelligible speech. Adults with CIs made better use of visual information in more difficult listening conditions (e.g., when multiple talkers or phonemically similar words were used). The findings are discussed in terms of the complementary nature of auditory and visual sources of information that specify the same underlying gestures and articulatory events in speech.

8.
Volta Rev ; 102(4): 303-320, 2000.
Article in English | MEDLINE | ID: mdl-21686052

ABSTRACT

An error analysis of the word recognition responses of cochlear implant users and listeners with normal hearing was conducted to determine the types of partial information used by these two populations when they identified spoken words under auditory-alone and audiovisual conditions. The results revealed that the two groups used different types of partial information in identifying spoken words under auditory-alone or audiovisual presentation. Different types of partial information were also used in identifying words with different lexical properties. In our study, however, there were no significant interactions with hearing status, indicating that cochlear implant users and listeners with normal hearing identify spoken words in a similar manner. The information available to users with cochlear implants preserves much of the partial information necessary for accurate spoken word recognition.
